tests: add integration test suite for reindexing #28

conorsch · 2025-02-16T21:23:01Z

Adds integration tests that fetch historical node archives from known URLs, unpack them, and build a `reindexer_archive.bin` containing all historical blocks for the target chain. This only exercises the `penumbra-reindexer archive` behavior; it doesn't yet exercise the `penumbra-reindexer regen` functionality. Future work can implement that.

For now, it's enough to have a reproducible process that confirms all historical chain state (for the popular chains like penumbra-1 and penumbra-testnet-phobos-2) can be collected to form a complete picture of event data. This reproducible process does rely on the existence of external data sources, in the form of historical archives in a known format at known URLs. But at least now we can verify that's the case, with known checksums, as well.
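To illustrate the "known checksums" part, here is a minimal sketch of verifying a downloaded archive against an expected SHA-256 digest. This is an assumption-laden example, not the code in this PR: the helper names (`sha256_of_file`, `verify_archive`) and the choice of SHA-256 are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks.

    Chunked reads keep memory use flat even for very large archives.
    """
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path: Path, expected_sha256: str) -> bool:
    """Compare the file's digest with the expected one (case-insensitive).

    Hypothetical helper: the real test suite may use a different hash
    algorithm or a different way of declaring expected checksums.
    """
    return sha256_of_file(path) == expected_sha256.lower()
```

A caller would typically refuse to extract any archive for which `verify_archive` returns `False`.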

Testing and review

You can pull down this branch and run `just integration`; that'll do the needful:

1. download gzipped tar archives of historical node state
2. generate an ephemeral `node0` directory with key material
3. clobber generated genesis file with the original genesis file for the target chain
4. extract archive over `node0` directory
5. run `penumbra-reindexer archive` to create an sqlite3 database
6. verify that no gaps are present in the sqlite3 database, i.e. all blocks are accounted for
7. repeat for multiple chains (currently only `penumbra-1` and `penumbra-testnet-phobos-2` are handled)
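The gap check in the steps above can be sketched roughly as follows. This is only an illustration: the table name `blocks` and column name `height` are hypothetical placeholders, and the actual schema written by `penumbra-reindexer archive` may differ.

```python
import sqlite3


def find_gaps(conn: sqlite3.Connection, table: str = "blocks", column: str = "height"):
    """Return a list of (first_missing, last_missing) height ranges.

    NOTE: `table` and `column` are hypothetical names, not the real schema.
    They are interpolated into the query, so only pass trusted values.
    """
    gaps = []
    previous = None
    query = f"SELECT {column} FROM {table} ORDER BY {column}"
    for (height,) in conn.execute(query):
        # Any jump larger than 1 between consecutive rows is a gap.
        if previous is not None and height > previous + 1:
            gaps.append((previous + 1, height - 1))
        previous = height
    return gaps


def demo():
    """Build a small in-memory database with two deliberate gaps."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE blocks (height INTEGER PRIMARY KEY)")
    conn.executemany(
        "INSERT INTO blocks (height) VALUES (?)",
        [(h,) for h in (1, 2, 3, 5, 6, 9)],
    )
    return find_gaps(conn)
```

An empty result means every height between the lowest and highest stored block is present. This sketch does not check that the lowest stored height is the chain's first block; that would need a separate assertion.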

However, be aware that doing so is a disk- and bandwidth-intensive process. On my machine, the process takes ~60-90 minutes, and results in about 500 GB of disk space being used.

By default, the app now logs at INFO level, without requiring an opt-in. Accordingly, I've dialed down the logging frequency on `penumbra-reindexer archive` to log every 10,000th block, rather than every thousandth.
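As a rough illustration of interval-based progress logging, the sketch below logs only when the block height is a multiple of the interval. The names `LOG_INTERVAL`, `should_log`, and `archive_blocks` are hypothetical, and the real tool is not written in Python; this only shows the idea.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("archive")

# Hypothetical constant mirroring the "every 10,000th block" behavior.
LOG_INTERVAL = 10_000


def should_log(height: int, interval: int = LOG_INTERVAL) -> bool:
    """True when this height is a multiple of the logging interval."""
    return height % interval == 0


def archive_blocks(heights):
    """Walk the given heights and return those that produced a log line."""
    logged = []
    for height in heights:
        # The actual archiving work would happen here.
        if should_log(height):
            log.info("archived block height=%d", height)
            logged.append(height)
    return logged
```

With a default INFO level, a tenfold larger interval keeps the output readable on chains with millions of blocks.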

cronokirby

This looks really cool!

The PD crashing thing is really a shame, because it would be nice to use the data structure we have representing what the artifacts look like to also figure out what the two integration test splits need to look like, but there's no good way to encode that, unfortunately.

I think subsequently it would be nice to refactor things a bit so that pointing at artifacts.plinfra.net isn't hardcoded, but that can be a follow-up.

conorsch · 2025-02-26T16:40:21Z

> The PD crashing thing is really a shame

Let's at least look into removing the sys-exit behavior in pd going forward. I'm skeptical it'll be feasible to backport a revised halt behavior to historical versions, but it's on the table.

> to also figure out what the two integration test splits need to look like

My thoughts exactly. It was tempting to make things DRY enough to generate plans from the archive declarations, but we're not quite there yet.

conorsch added 4 commits February 16, 2025 13:10

feat: more logging by default

97b5f8c

By default, the app now logs at INFO level, without requiring an opt-in. Accordingly, I've dialed down the logging frequency on `penumbra-reindexer archive` to log every 10,000th block, rather than every thousandth.

docs: clarify --home argument path docs

13bf6be

chore: bump cometbft to 0.37.15

2c71ded

tests: add integration test suite for reindexing

d7b3b18

Adds integration tests that fetch historical node archives from known URLs, unpack them, and build a `reindexer_archive.bin` containing all historical blocks for the target chain.

conorsch requested a review from cronokirby February 16, 2025 21:23

cronokirby approved these changes Feb 20, 2025

conorsch merged commit 589bc85 into main Feb 26, 2025
5 checks passed
